Role             Responsibilities
Project sponsor  Represents the business interests; champions the project
Client           Represents end users' interests; domain expert
Data Scientist   Sets and executes analytic strategy; communicates with sponsor and client
Data Architect   Manages data storage; sometimes manages data collection
Operations       Manages infrastructure; deploys final project results


Meet with project sponsor and stakeholders
to collect ideas and relevant background information


□ What are the constraints that have to 
be met for successful deployment? 


□ What are they doing to solve the problem now, 
and why isn't that good enough? 


Determine if the specified goals are actually going 
to make business sense and whether 
you have data and tools 
to achieve all aspects 


° Why do the sponsors want 
the project in the first place? 


Questions that must 
be answered: 


° What resources are needed: 
what kind of data and how much staff? 


□ What do they lack, and what do they need?


° What are the computational 
resources? 


□ Domain experts to 
collaborate with? 


Do we have all the
necessary information
to proceed?


IF NO 


IF YES 


Define the project 
goal as coming up 
with a candidate hypothesis 


Define the precise 
goal of the project 


CONCRETE REQUIREMENTS 


BAD GOAL DEFINITION


GOOD GOAL DEFINITION


Specify concrete requirements 
of hypothesis 


"We want to get better at finding bad loans."


CONCRETE STOP CONDITIONS 


UNCLEAR GOAL LEADS TO 


"We want to reduce our rate of loan 
charge-offs by at least 10%, using a 
model that predicts which loan 
applicants are likely to default." 
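
Because the good goal definition is measurable, it can be written down as a check. The following is a minimal sketch, assuming the "at least 10%" is a relative reduction in the charge-off rate; the function names and all figures are hypothetical and chosen only for illustration.

```python
def relative_reduction(baseline_rate: float, new_rate: float) -> float:
    """Fractional reduction of new_rate relative to baseline_rate."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    return (baseline_rate - new_rate) / baseline_rate


def meets_goal(baseline_rate: float, new_rate: float, target: float = 0.10) -> bool:
    """True when charge-offs fell by at least `target` (assumed relative, 10% by default)."""
    return relative_reduction(baseline_rate, new_rate) >= target


# Hypothetical figures: 500 charge-offs per 10,000 loans before the model,
# 440 per 10,000 during a pilot.
baseline = 500 / 10_000
pilot = 440 / 10_000
print(f"reduction: {relative_reduction(baseline, pilot):.1%}")
print("goal met:", meets_goal(baseline, pilot))
```

The bad goal definition ("get better at finding bad loans") offers nothing comparable to test, which is one way to tell the two apart.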


Decide on concrete stop conditions, 
such as a time limit 


Miscommunication, 
wasted resources, 
all parties unhappy 


CONCRETE GOAL LEADS TO 



Concrete stopping conditions and 
concrete acceptance criteria 
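
One way to picture stopping conditions working alongside an acceptance criterion is a loop that tries modeling approaches until one is accepted or a budget runs out. This is only a sketch under stated assumptions: the approach names, the fixed scores, the target score, and the helper `run_until_done` are all hypothetical.

```python
import time


def run_until_done(attempts, target_score, time_limit_s, max_attempts):
    """Try approaches in order; stop on acceptance or when a budget is exhausted."""
    deadline = time.monotonic() + time_limit_s
    for i, (name, evaluate) in enumerate(attempts, start=1):
        if i > max_attempts:
            return ("stopped: attempt limit", None)
        if time.monotonic() > deadline:
            return ("stopped: time limit", None)
        if evaluate() >= target_score:  # acceptance criterion
            return ("accepted", name)
    return ("stopped: no approaches left", None)


# Hypothetical approaches; the lambdas stand in for real held-out evaluations.
attempts = [
    ("logistic regression", lambda: 0.71),
    ("decision tree", lambda: 0.78),
    ("gradient boosting", lambda: 0.84),
]
print(run_until_done(attempts, target_score=0.80, time_limit_s=60, max_attempts=5))
```

Every exit path returns an explicit reason, so a stopped project can still be reported to the sponsor rather than drifting on.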


BAD OUTCOME 


PROJECT FAILURE 


RESULTS IN 



Big picture problem that can be 
broken down into sub-problems which are: 

- Specific 

- Measurable 

- Achievable 

- Relevant 

- Time-bounded 


GOOD OUTCOME 


PROJECT SUCCESS 


Identifying the data you need,
exploring it, and conditioning
it to be suitable for analysis


Discoveries that call for refinement of project goals: 
Need other types of information 

Data isn't suitable for the problem

Things in the data that raise issues more important
than the one you originally planned to address.



USED TO


Extract useful insights from the
data in order to achieve your goals


Often back and forth between 
model building stage and management stage 


Key Questions: 



Clean the data:
repair data errors and 
transform variables 
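
As an illustration of both halves of this step, the sketch below repairs invalid values and then transforms a variable. It assumes a hypothetical `income` field in which `None` and negative numbers are data errors; median imputation and a log transform are one common choice, not the only one.

```python
import math
from statistics import median

# Hypothetical raw records; None and negative incomes stand in for data errors.
raw = [
    {"income": 42000.0, "age": 31},
    {"income": None, "age": 45},
    {"income": -1.0, "age": 52},
    {"income": 87000.0, "age": 28},
]


def clean(records):
    """Replace invalid incomes with the median of valid ones, then add log income."""
    valid = [r["income"] for r in records
             if r["income"] is not None and r["income"] >= 0]
    if not valid:
        raise ValueError("no valid income values to impute from")
    fill = median(valid)
    cleaned = []
    for r in records:
        income = r["income"]
        repaired = income is None or income < 0
        if repaired:
            income = fill
        cleaned.append({
            **r,
            "income": income,
            "income_was_repaired": repaired,   # keep the repair visible downstream
            "log_income": math.log1p(income),  # transform a skewed variable
        })
    return cleaned


for row in clean(raw):
    print(row)
```

Keeping a flag column for repaired values means the model, or a later reviewer, can still see where the original data was faulty.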






Deploy small scale test pilot 





Determine if model meets your goals 


Is it accurate enough for your needs? Does it generalize well? 


Does it perform better than "the obvious guess"? Better than whatever estimate you currently use? 


Do the results of the model (coefficients, clusters, rules) make sense in the context of the problem domain?
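
The comparison against "the obvious guess" can be made concrete with a majority-class baseline. This is a minimal sketch, assuming a binary default/repaid label and accuracy as the metric; the labels and the `model_pred` list are hypothetical stand-ins for a real model's held-out predictions.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_baseline(y_train, n):
    """The 'obvious guess': always predict the most common training label."""
    most_common = Counter(y_train).most_common(1)[0][0]
    return [most_common] * n


# Hypothetical labels: 1 = default, 0 = repaid.
y_train = [0, 0, 0, 1, 0, 0, 1, 0]
y_test = [0, 1, 0, 0, 1, 0, 0, 0]
model_pred = [0, 1, 0, 0, 0, 0, 0, 0]

baseline_acc = accuracy(y_test, majority_baseline(y_train, len(y_test)))
model_acc = accuracy(y_test, model_pred)
print(f"baseline={baseline_acc:.3f} model={model_acc:.3f}")
print("beats the obvious guess:", model_acc > baseline_acc)
```

Scoring on held-out labels rather than the training set is what speaks to the generalization question. With imbalanced classes, accuracy alone can flatter the baseline, so a different metric may suit a real project better.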


IF NO 


Change modeling approach 


IF STILL NO 


IF YES 


Report results 



Data doesn't support the 
goal you're trying to achieve 


Need to define more realistic goals


Need to gather additional data or other resources
that you need to achieve your original goals





Present results to your project 
sponsor and other stakeholders 


Ensure that the model will run smoothly 


Make sure that the model can be updated 
as its environment changes 


Present documentation to those responsible 
for using, running, and maintaining the 
model once it has been deployed 


Presentation for the model's end users should convey how the 
model will help them do their job better: 






Modify if necessary to correct any
unexpected results





Documentation for operations staff should 
emphasize the impact of your model on 
the resources that they're responsible for. 


Model is put into operation 


